Optimizations & Genetic Algorithms

Mozhan Soltani PhD student Software Engineering Research Group

2017-12-20

What is optimization about?

What Is Optimization?

  • Optimization is the process of adjusting the inputs to or characteristics of a device, mathematical process, or experiment to find the minimum or maximum output.
    e.g., adjusting the screen light so it takes as little power as possible to produce sufficient light!

Dynamic Optimization

Optimization depends on secondary variables like time!

examples:
- The fastest way to work may depend on the time of day and weather.
- The cheapest flight ticket depends on the season of the year.

Any other situation you can think of?

Static Optimization

Finding the minimum for \(f(x) = x^2\) does not depend on a secondary variable like time!

It is less complex than dynamic optimization!

Discrete or Continuous Optimization

Discrete variables have a finite number of possible values
e.g.: the order to do a series of tasks on a list to take as little time as possible.
Continuous variables have an infinite number of possible values, like below:
Continuous function

Trial and Error Optimization

refers to the process of adjusting variables that affect the output without knowing much about the process that produces the output.

examples:
adjusting the antenna to get the best reception.

mixing chemicals to find new drugs, e.g. antibiotics.

Natural Optimization Methods

  • Simulated Annealing - SA
    – inspired by the anealing process for material
  • Particle Swarm Optimization - PSO
    – inspired by flocking birds/schooling fish
  • Ant Colony Optimization - ACO
    – inspired by the social behavior of ants
  • Genetic Algorithm - GA
    – inspired by the Darwinian evolution process
  • etc.

Simulated Annealing

Simulates the annealing process:

a substance is heated above its melting temperature and then gradually cooled to produce the crystalline lattice, which minimizes its energy probability distribution.

Simulated Annealing

So lowering the temperature until the material is in a steady frozen state can be preceived as searching for the material state in which the internal particles have the minimum energy.

Simulated Annealing (SA) Algorithm

  1. Set temperature \(T\) to maximum
  2. Generate a random solution
  3. Calculate fitness/cost
  4. Generate a neighbouring solution
  5. Calculate the fitness/cost of the new solution
  6. Compare \(f\)(new) and \(f\)(old)
    if the new solution is better then take it!
    else use an acceptance rate to choose between the two!
  7. Reset \(T\): \(T\) * \(alpha\)
    (normally \(0.8<\)\(alpha\)\(<0.99\))
  8. Repeat 4-7 until:
    either the global optimum is found
    or \(T\) reaches the minimum.

Let’s play a guessing game!

Guess implementation

Generate a random string which has the same size with the target string (password).

class Guess:
    def __init__(self, chars, fitness):
        # a guess is made of characters and has a fitness score
        self.Chars = chars
        self.Fitness = fitness

def generate_guess(length, charSet, get_fitness):
    chars = []
    #pick (length) random elements from the charSet (set of genes)
    chars.extend(random.sample(charSet, length))
    #turn the assembled set of characters into a string
    chars = ''.join(chars)
    fitness = get_fitness(chars)
    return Guess(chars, fitness)

How to get a neighbor solution?

A neighbor solution has a minimal difference with the current one!

def get_neighbor(parent, charSet, get_fitness):
    # get a random index
    index = random.randrange(0, len(parent.Chars))
    #turn the parent characters into a list
    childChars = list(parent.Chars)
    #pick 2 random characters from the possible set of genes
    newChar, alternate = random.sample(charSet, 2)
    #replace the index character with one of the randomly found genes
    childChars[index] = alternate if newChar == childChars[index] else newChar
    chars = ''.join(childChars)
    #get the fitness of the newly generated character set
    fitness = get_fitness(chars)
    return Guess(chars, fitness)

Fitness function and acceptance rate

Reward a guess for every correct character in the right place!

def get_fitness(guess, target):
    #increment the fitness score by 1 for every matching character
    return sum(1 for expected, actual in zip(target, guess)
               if expected == actual)

As the temprature goes down (search progresses), the probability of accepting a worse guess also goes down!

def acceptance_probability(prev_score, next_score, temperature):     
    return math.exp(temperature/(1+temperature))

Implementation of the annealing process

Until the temperature hits the minimum, keep searching for better neighbors, and somtimes allow worse ones to enter the loop.

def anneal(get_fitness, targetLen, optimalFitness, charSet, display, temperature, min_temp, alpha):
    #make the random initial guess
    bestGuess = generate_guess(targetLen, charSet, get_fitness)
    display(bestGuess)
    best_fitness = bestGuess.Fitness
    #If the guess is already perfect then no need to go on with the search!
    if bestGuess.Fitness >= optimalFitness:
        return bestGuess
    #Set the temperatures
    T = temperature
    T_min = min_temp
    #until the temperature reaches the minimum, search!
    while T > T_min:
        new_solution = get_neighbor(bestGuess, charSet, get_fitness)
        new_fitness = new_solution.Fitness
        #If the generated neighbor is already good, return it.
        if new_fitness == optimalFitness:
            return new_solution
        #If the generated neighbor is better, move on to it.
        if new_fitness > best_fitness:
            bestGuess = new_solution
            best_fitness = new_fitness
        else:
            ar = acceptance_probability(best_fitness, new_fitness, T)
            #let the acceptance rate recommend whether to switch to the worse or not.
            if ar > random.uniform(2,3):
                bestGuess = new_solution
                best_fitness = new_fitness
        #the temperature decreases
        T = T*alpha
        display(bestGuess)
    return bestGuess

Let’s now guess a password, using SA!

class GuessPasswordTest(unittest.TestCase):
    # The set of possible genes to choose randomly from.
    charset = " abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!.,"
    def test_sa(self):
        # The target string to search for
        target = "Hello World!"
        self.guess_password(target)
    def guess_password(self, target):
        startTime = datetime.datetime.now()
        
        #wrapper for the fitness function to pass to GA
        def fnGetFitness(chars):
            return sa.get_fitness(chars, target)

        #optimal fitness score based on the length of the target string
        optimalFitness = len(target)
        
        #let's search for it
        best = sa.anneal(fnGetFitness, len(target), optimalFitness,
                                self.charset, fnDisplay, 100000.0, 0.00001, 0.99 )
        #The search result must be the same as the target string!
        self.assertEqual(best.Chars, target)

Hill Climbing:

a very similar optimziation algorithm to SA

The difference is that worse neighbours will never get to be selected!

Local Optimum

If hill climbing hits the local optimum, one approach is to restart the search!

Particle Swarm Optimization - PSO

PSO simulates the behavior of bird flocking.
Suppose: a group of birds are randomly searching food in an area. There is only one piece of food in the area being searched. No bird knows where the food is. But they know how far the food is in each iteration. An effective way to find the food is to follow the bird which is nearest to the food.

PSO

PSO is a population based algorithm that uses a number of particles which form a swarm to search for an optimum.

Each particle adjusts their flying based on their own and other particle’s experience until the swarm finally finds the optimum or the search budget is exhausted.

Search budget could be based on time, number of iterations, etc.

PSO Algorithm

  1. Begin with a random population of particles (solutions).
  2. For each particle, calculate the fitness score.
    – Fitness score shows the distance from the optimum
    – If the fitness score is better than the value the row has ever had, update pBest.
  3. Choose the particle with the best fitness score.
  4. For each particle, calculate the velocity, and update the position of the particle accordingly.
  5. repeat 2-4 until global optimum is found, or search budget is exhausted.

How to calculate position and velocity?

\(x\)\(i\)\((k+1)\) \(=\) \(x\)\(i\)\(k\) \(+\) \(v\)\(i\)\((k+1)\)
\(v\)\(i\)\((k+1)\) \(=\) \(w\)\(k\)\(v\)\(i\)\(k\) \(+\) \(c\)\(1\)\(r\)\(1\) (\(p\)\(i\)\(k\) \(-\) \(x\)\(i\)\(k\)) \(+\) \(c\)\(2\)\(r\)\(2\) (\(p\)\(g\)\(k\) \(-\) \(x\)\(i\)\(k\))

where:
\(p\)\(i\)\((k)\) \(=\) best position particle \(i\) has had
\(p\)\(g\)\((k)\) \(=\) best particle position
\(w\)\(k\) \(=\) constant inertia weight
\(c\)\(1\), \(c\)\(2\) \(=\) learning factors (usually \(c\)\(1\) \(=\) \(c\)\(2\) \(=\) \(2\))
\(r\)\(1\), \(r\)\(2\) \(=\) random numbers between \(0\) and \(1\)

Let’s find the minimum for \(f(x) = x^2\) !

We can use the Python library called PySwarms:
https://github.com/ljvmiranda921/pyswarms

SOP example, using PySwarms

Set up the PSO parameters, then call an optimizer.

You could plot the search progress and animate it (optional).

import pyswarms as ps
import matplotlib.pyplot as plt
from pyswarms.utils.functions import single_obj as fx
from pyswarms.utils.environments import PlotEnvironment

#Set-up optimizer
options = {'c1':0.5, 'c2':0.3, 'w':0.9}
optimizer = ps.single.GlobalBestPSO(n_particles=10, 
                                    dimensions=3, 
                                    options=options)
#Initialize plot environment
plt_env = PlotEnvironment(optimizer, fx.sphere_func, 350)
#Plot the cost
plt_env.plot_cost(figsize=(8,6));
plt.show()
#animate the search process
anim = plt_env.plot_particles2D(limits=((-1.2,1.2),(-1.2,1.2)))
anim.save('animation.gif', writer='imagemagick', fps=15)

What does it look like?

What does it look like?

Swarm optimization

More on PSO

For more on PSO, check out the following:

Genetic Algorithms

Inspired by the Darwinian evolution and the concept of survival of the fittest!

Genetic Algorithm Overview

Each point in the search space is an referred to as an individual or chromosome.

Individuals collectively form a population, which is iteratively evolved until the optimum is found or the search budget is exhausted.

Evolutionary Operators in GA

Crossover is used to re-combine the properties from parents.
Mutation is applied to explore the neighborhood!

Let’s play the same guessing game!

What do we need to implement?

  1. Individual (chromosome) generator
  2. Fitness function
  3. Evolutionary operators (only mutation for now)
  4. Search engine
  5. Display function, showing the search progress (optional!)

Individual Generator and Fitness Function

class Chromosome:
    def __init__(self, genes, fitness):
        self.Genes = genes
        self.Fitness = fitness
def generate_parent(length, geneSet, get_fitness):
    genes = []
    while len(genes) < length:
        sampleSize = min(length - len(genes), len(geneSet))
        genes.extend(random.sample(geneSet, sampleSize))
    genes = ''.join(genes)
    fitness = get_fitness(genes)
    return Chromosome(genes, fitness)
def get_fitness(guess, target):
    return sum(1 for expected, actual in zip(target, guess)
               if expected == actual)

Mutation

Similar to the one used in SA!

def mutate(parent, geneSet, get_fitness):
    index = random.randrange(0, len(parent.Genes))
    childGenes = list(parent.Genes)
    newGene, alternate = random.sample(geneSet, 2)
    childGenes[index] = alternate if newGene == childGenes[index] 
                                  else newGene
    genes = ''.join(childGenes)
    fitness = get_fitness(genes)
    return Chromosome(genes, fitness)

Search Engine

def get_best(get_fitness, targetLen, optimalFitness, geneSet, display):
    random.seed()
    bestParent = _generate_parent(targetLen, geneSet, get_fitness)
    display(bestParent)
    if bestParent.Fitness >= optimalFitness:
        return bestParent
    while True:
        child = _mutate(bestParent, geneSet, get_fitness)
        if bestParent.Fitness >= child.Fitness:
            continue
        display(child)
        if child.Fitness >= optimalFitness:
            return child
        bestParent = child

Let’s test it!

class GuessPasswordTest(unittest.TestCase):
    geneset = " abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!.,"

    def test_GA(self):
        target = "Hello World!"
        self.guess_password(target)

    def guess_password(self, target):
        startTime = datetime.datetime.now()

        def fnGetFitness(genes):
            return genetic.get_fitness(genes, target)

        def fnDisplay(candidate):
            genetic.display(candidate, startTime)

        optimalFitness = len(target)
        best = genetic.get_best(fnGetFitness, len(target), optimalFitness,
                                self.geneset, fnDisplay)
        self.assertEqual(best.Genes, target)

What was missing in the implementation?

  • implementation of selection
  • implementation of crossover
  • implementation of insertion

Travelling Salesman

To further practice, try implementing GA for the TS problem!

What are the similarities and differences among the algorithms?

  • individual representation
  • fitness function
  • random initialization
  • local/global approaches

Which search algorithm to use?

No Free Lunch Theorem says that the average performance of all search algorithms on all optimization problems is equal!

i.e. GA performs no better than pure random search when applied to all problems!

So again - which search algorithm to use?

We can always start by simpler algorithms, e.g. hill climbing, and see if they are good enough for the problem at hand.

If the search landscape is perturbated, more advanced algorithms which can effectively explore the landscape may be a better start.

REFERENCES